Abstract

The large deviation principle is proved for a class of $L^2$-valued processes that
arise from the coarse-graining of a random field. Coarse-grained processes of this
kind form the basis of the analysis of local mean-field models in statistical mechanics
by exploiting the long-range nature of the interaction function defining such models.
In particular, the large deviation principle is used in a companion paper [?] to
derive the variational principles that characterize equilibrium macrostates in statistical
models of two-dimensional and quasi-geostrophic turbulence. Such macrostates
correspond to large-scale, long-lived flow structures, the description of which is the
goal of the statistical equilibrium theory of turbulence. The large deviation bounds
for the coarse-grained process under consideration are shown to hold with respect
to the strong $L^2$ topology, while the associated rate function is proved to have compact
level sets with respect to the weak topology. This compactness property is
nevertheless sufficient to establish the existence of equilibrium macrostates for both
the microcanonical and canonical ensembles.

Key words and phrases: Large deviation principle, Cramér's Theorem, coarse-graining,
statistical models of turbulence



1 Introduction 

In many statistical mechanical models coarse-graining is a fundamental construction that
mediates between a microscopic scale on which the model is defined and a macroscopic
scale on which the model is analyzed in a thermodynamic or continuum limit. In this
general context, the coarse-graining is determined by an averaging procedure on an in- 
termediate scale. Its usefulness relies on the property that the functions defining the in- 
teractions of the model's variables can be well approximated by corresponding functions 
of the averaged variables. Typically, this situation is met in models that have long-range 
interactions and therefore have the character of local mean-field theories. For models of 
this kind, a complete and rigorous analysis of the thermodynamic or continuum limit can 
be carried out once the asymptotic behavior of an appropriate coarse-grained process is 
characterized. Such a characterization is provided by a large deviation principle, which 
expresses in a sharp form the statistical effect of the averaging procedure. 

The well-known models of two-dimensional turbulence [?], [15], [11], or more generally
quasi-geostrophic turbulence [?], are prime examples of this general class of local
mean-field theories. As we explain later, a coarse-grained process for these models is defined
by taking a certain local average of the underlying microscopic vorticity field. The dy- 
namical invariants, which in these models include the energy and circulation, are then 
expressed as functions of this coarse-grained process via approximations that become ex- 
act in the continuum limit. With this representation in hand, rigorous large deviation 
techniques allow one to deduce, in an essentially intuitive way, the asymptotic behavior 
of the vorticity field in the continuum limit. In fluid dynamical terms, the large deviation 
principle distinguishes certain mean flow structures, which may take the form of jets or 
vortices, as the most probable macrostates against a background of fluctuating, filamen- 
tary microscopic vorticity. In this way the large deviation analysis plays a pivotal role 
in realizing the main goal of such models: to explain the emergence and persistence of 
coherent structures within the turbulent flow. 

For the sake of definiteness, let us consider two-dimensional turbulence in the unit torus
$T^2 = [0,1) \times [0,1)$. The underlying Hamiltonian system is governed by the Euler equations
for an inviscid, incompressible fluid with periodic boundary conditions on the velocity and
pressure fields. This system is most conveniently described as an evolution equation for
the vorticity $\omega(x,t) = \partial u_2/\partial x_1 - \partial u_1/\partial x_2$, which is the perpendicular
component of the curl of the velocity field $u = (u_1(x,t), u_2(x,t))$, $x = (x_1, x_2) \in T^2$. With respect to
this dynamics the vorticity $\omega$ is advected, or rearranged, by the incompressible velocity
field $u$ that it induces instantaneously. Generically, this self-straining motion produces a
fine-grained vorticity field $\omega$ that exhibits complex fluctuations on the small spatial scales.
Statistical equilibrium models are introduced to capture the essential features of the flow
without resolving the small-scale dynamics.

In order to construct a statistical equilibrium model, one discretizes the dynamics and
replaces the fine-grained vorticity field $\omega$ by a microstate $\zeta$ defined on a suitable lattice
$\mathcal{L}_n$ in $T^2$. Specifically, for each $n \in \mathbb{N}$ let $\mathcal{L}_n$ be a uniform lattice of $a_n = 2^{2n}$ sites $s$ in
$T^2$. The intersite spacing in each coordinate direction is $2^{-n}$. Each such lattice of $a_n$ sites
induces a dyadic partition of $T^2$ into $a_n$ squares called microcells, each having area $1/a_n$.
For each $s \in \mathcal{L}_n$ we denote by $M(s)$ the unique microcell containing the site $s$ in its lower
left corner. The configuration spaces for the model are $\mathcal{Y}^{a_n}$, where $\mathcal{Y}$ is a given closed
subset of $\mathbb{R}$. The elements of $\mathcal{Y}^{a_n}$ are the microstates $\zeta = \{\zeta(s), s \in \mathcal{L}_n\}$, which we can
identify with piecewise-constant vorticity fields $\zeta$ relative to $\mathcal{L}_n$; that is, $\zeta(x) = \zeta(s)$ for
all $x \in M(s)$, $s \in \mathcal{L}_n$.

The probabilistic structure of the discretized microstate $\zeta$ is chosen to be consistent
with the postulated behavior of the fine-grained vorticity field $\omega$. Specifically, it is
determined by a probability measure $P_n$ defined as follows. Let $\rho$ be a probability measure on
$\mathbb{R}$ with support $\mathcal{Y}$. If $\mathcal{Y}$ is unbounded, assume that

$$\int_{\mathbb{R}} e^{\alpha y}\,\rho(dy) < \infty \quad \text{for all } \alpha \in \mathbb{R}. \tag{1.1}$$

We then define $P_n$ to be the product measure on $\mathcal{Y}^{a_n}$ with identical one-dimensional
marginals $\rho$. With respect to $P_n$, the collection $\{\zeta(s), s \in \mathcal{L}_n\}$ is a finite family of
independent, identically distributed (i.i.d.) random variables having common distribution $\rho$
and common range $\mathcal{Y}$. Under a suitable ergodic hypothesis, the given measure $\rho$ incorporates
the family of invariants associated with incompressible rearrangement of vorticity.
Details are discussed in [?], [?], [16].

The basic dynamical invariant of the Euler equations is the kinetic energy, which is
expressible as the following functional of the vorticity:

$$H(\omega) = \frac{1}{2}\int_{T^2 \times T^2} g(x - x')\,\omega(x)\,\omega(x')\,dx\,dx'. \tag{1.2}$$

In this formula $g(x - x')$ is the generalized Green's function for $-\Delta$ on $T^2$. In the lattice
model on $\mathcal{L}_n$, $H(\omega)$ is replaced by a lattice Hamiltonian $H_n$. This is defined for $\zeta \in \mathcal{Y}^{a_n}$
by

$$H_n(\zeta) = \frac{1}{2a_n^2}\sum_{s,s' \in \mathcal{L}_n} g_n(s - s')\,\zeta(s)\,\zeta(s'), \tag{1.3}$$

where $g_n(s - s')$ is a certain lattice approximation to the generalized Green's function
$g(x - x')$. For instance, $g_n(s - s')$ may be determined from a finite-difference discretization
or a spectral truncation.

The formalism of equilibrium statistical mechanics provides two joint probability
distributions for microstates $\zeta \in \mathcal{Y}^{a_n}$: the microcanonical ensemble and the canonical
ensemble. In probabilistic terms, the microcanonical ensemble expresses the conditioning of
$P_n$ on the energy shell $\{\zeta \in \mathcal{Y}^{a_n} : H_n(\zeta) = E\}$, where $E \in \mathbb{R}$ is a specified energy value.
In order to avoid technical problems with the existence of regular conditional probability
distributions, we shall condition $P_n$ on the thickened energy shell $\{H_n(\zeta) \in [E - \epsilon, E + \epsilon]\}$,
where $\epsilon > 0$. Thus, the microcanonical ensemble is the measure defined for Borel subsets
$B$ of $\mathcal{Y}^{a_n}$ by

$$P_n^{E,\epsilon}\{B\} = P_n\{B \mid H_n \in [E - \epsilon, E + \epsilon]\} = \frac{P_n\{B \cap \{H_n \in [E - \epsilon, E + \epsilon]\}\}}{P_n\{H_n \in [E - \epsilon, E + \epsilon]\}}.$$

Correspondingly, the canonical ensemble is defined for Borel subsets $B$ of $\mathcal{Y}^{a_n}$ by

$$P_{n,\beta}\{B\} = \frac{1}{Z(n,\beta)}\int_B \exp[-\beta H_n]\,dP_n.$$

Here $\beta$ is a real number denoting the inverse temperature, and $Z(n,\beta)$ is the partition
function $\int_{\mathcal{Y}^{a_n}} \exp[-\beta H_n]\,dP_n$, which normalizes the probability measure $P_{n,\beta}$.






Models of this kind were originally proposed independently by Miller et al. [?]
and Robert et al. [14], [15] in the context of two-dimensional Euler flow. Subsequently
they were extended to geophysical fluid dynamics, such as barotropic quasi-geostrophic
flow [?], [?]. A model of spin systems on a circle that exhibits an interesting phase transition
has a similar, but simpler structure [?].

In the setting of models of two-dimensional and quasi-geostrophic turbulence, there 
are two basic goals of the equilibrium statistical theory: first, to predict the formation of 
stable, coherent flow structures from either the microcanonical ensemble or the canonical 
ensemble; second, to deduce whether the two ensembles yield equivalent results. In order 
to achieve these two goals, which depend on deriving properties of the two ensembles 
in the continuum limit, the crucial innovation is to introduce a two-parameter stochastic 
process that is defined by a coarse-graining, or local averaging, of the microscopic vorticity 
field over an intermediate scale. We now present this key construction. 

Given $n \in \mathbb{N}$ and a positive integer $r < n$, we consider a dyadic partition of the lattice
$\mathcal{L}_n$ into $\gamma_r = 2^{2r}$ blocks, each block containing $a_n/\gamma_r$ lattice sites. In correspondence with
this partition we have a dyadic partition $\{D_{r,k}, k = 1, \ldots, \gamma_r\}$ of $T^2$ into macrocells. Each
macrocell is the union of $a_n/\gamma_r$ microcells $M(s)$. This partition of $\mathcal{L}_n$ into $\gamma_r$ blocks
represents a coarse-graining of the lattice $\mathcal{L}_n$. With respect to this partition, we define
the following coarse-grained process, obtained by a local averaging over the sites of the
macrocells $D_{r,k}$:

$$W_{n,r}(x) = W_{n,r}(\zeta, x) = \sum_{k=1}^{\gamma_r} 1_{D_{r,k}}(x)\,S_{n,r,k}(\zeta), \tag{1.4}$$

where

$$S_{n,r,k}(\zeta) = \frac{1}{a_n/\gamma_r}\sum_{s \in D_{r,k}} \zeta(s). \tag{1.5}$$

The doubly indexed process $W_{n,r}$ takes values in the space $L^2(T^2)$.
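
As an illustration of the coarse-graining in (1.4)-(1.5), the following Python sketch computes the coarse-grained field on the lattice by block averaging. It is not part of the analysis: the function name `coarse_grain`, the use of NumPy, the storage of the microstate as a $2^n \times 2^n$ array indexed by lattice site, and the choice of a standard Gaussian single-site distribution are all assumptions made only for this example.

```python
import numpy as np

def coarse_grain(zeta: np.ndarray, r: int) -> np.ndarray:
    """Return the coarse-grained field W_{n,r}, sampled on the fine lattice.

    zeta has shape (2**n, 2**n); each of the 2**(2r) macrocells is a square
    block of 2**(n-r) x 2**(n-r) lattice sites.
    """
    side = zeta.shape[0]
    blocks = 2 ** r                      # macrocells per coordinate direction
    if zeta.shape != (side, side) or side % blocks != 0:
        raise ValueError("zeta must be a square 2^n x 2^n array with r <= n")
    m = side // blocks                   # lattice sites per macrocell side
    # S_{n,r,k}: sample mean of zeta over each macrocell, as in (1.5).
    means = zeta.reshape(blocks, m, blocks, m).mean(axis=(1, 3))
    # W_{n,r}: piecewise-constant field equal to S_{n,r,k} on D_{r,k}, as in (1.4).
    return np.kron(means, np.ones((m, m)))

rng = np.random.default_rng(0)
n, r = 5, 2
zeta = rng.standard_normal((2 ** n, 2 ** n))   # i.i.d. sites; Gaussian rho is an assumption
W = coarse_grain(zeta, r)
```

In this sketch `W` takes at most $\gamma_r = 2^{2r}$ distinct values, one per macrocell, and has the same spatial mean as `zeta`.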

The process $W_{n,r}$ has the following two properties, which will allow us to evaluate its
continuum limit with respect to either the microcanonical or canonical ensemble.

1. In the double limit $n \to \infty$, $r \to \infty$, with respect to the product measures $P_n$, $W_{n,r}$
satisfies the large deviation principle (LDP) on $L^2(T^2)$ with scaling constants $a_n$
and an explicitly determined rate function $I$.

2. In the double limit $n \to \infty$, $r \to \infty$, the Hamiltonian $H_n(\zeta)$ is asymptotic to
$H(W_{n,r}(\zeta))$ uniformly over microstates, where the functional $H$ mapping $L^2(T^2)$
into $\mathbb{R}$ is defined in (1.2); in symbols,

$$\lim_{r \to \infty}\lim_{n \to \infty}\sup_{\zeta \in \mathcal{Y}^{a_n}} |H_n(\zeta) - H(W_{n,r}(\zeta))| = 0. \tag{1.6}$$

The proof of the two-parameter LDP in item 1 is the main task of this paper. We will
give a heuristic proof later in this section. In Section 2 we will formulate the LDP for a
natural generalization of $W_{n,r}$ and will prove this in Section 3.

The verification of (1.6) in item 2 can be carried out as in [1, §4.2], where a similar
approximation is verified. This approximation depends on the fact that the
vortex interactions governed by $H_n$ are long-range, being determined essentially by the
Green's function $g(x - x')$. For this reason, $H_n$ is not sensitive to the small-scale structure
of the vorticity field and depends only on the local mean vorticity in the continuum limit.
In other words, $H_n$ is well approximated by a function of the coarse-grained process
$W_{n,r}$. This kind of behavior is typical of local mean-field theories. The turbulence models
under consideration here have the property that their local mean-field approximations are
asymptotically exact [13].
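
The insensitivity of the energy to small-scale structure can be observed numerically. The sketch below is only an illustration under stated assumptions: it takes $g_n$ to be a spectral truncation of the Green's function (one of the two options mentioned for the lattice Hamiltonian), evaluates the resulting quadratic form with a fast Fourier transform, and uses a hypothetical microstate consisting of a smooth profile plus i.i.d. Gaussian noise. The names `energy` and `coarse_grain` are ours.

```python
import numpy as np

def energy(w):
    """Spectral lattice energy (1/2) * sum_{k != 0} |w_hat(k)|^2 / (4 pi^2 |k|^2)."""
    N = w.shape[0]
    w_hat = np.fft.fft2(w) / N ** 2               # Fourier coefficients on the torus
    k = np.fft.fftfreq(N, d=1.0 / N)              # integer wavenumbers
    k2 = k[:, None] ** 2 + k[None, :] ** 2
    k2[0, 0] = np.inf                             # the k = 0 mode carries no energy
    return float(0.5 * np.sum(np.abs(w_hat) ** 2 / (4.0 * np.pi ** 2 * k2)))

def coarse_grain(zeta, r):
    """Replace zeta by its average over each of the 2**(2r) macrocells."""
    N = zeta.shape[0]
    m = N // 2 ** r
    means = zeta.reshape(2 ** r, m, 2 ** r, m).mean(axis=(1, 3))
    return np.kron(means, np.ones((m, m)))

rng = np.random.default_rng(1)
n = 6
N = 2 ** n
x1 = (np.arange(N) / N)[:, None]
# Hypothetical microstate: smooth large-scale profile plus small-scale noise.
zeta = 3.0 * np.sin(2.0 * np.pi * x1) + rng.standard_normal((N, N))
H_micro = energy(zeta)
# Gap between the energy of the microstate and of its coarse-grained version.
gaps = {r: abs(H_micro - energy(coarse_grain(zeta, r))) for r in (1, 2, 3, 4, 5)}
```

For this particular field the gap at $r = 4$ is a small fraction of the energy, while the gap at $r = 1$ is not. This is consistent with, but of course does not prove, the uniform statement (1.6).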



In the next part of this section, we motivate the two-parameter LDP in item 1. Later 
in Section 5 we indicate how items 1 and 2 together allow one to evaluate the continuum 
limit of $W_{n,r}$ with respect to the microcanonical ensemble and the canonical ensemble.
These limits are expressed in terms of variational formulas whose solutions correspond 
to coherent structures for the two ensembles. There we also discuss the question of 
equivalence and nonequivalence of ensembles. 

In order to motivate the LDP in item 1, we note that with respect to $P_n$ the normalized
sums $S_{n,r,k}$ are sample means of the $a_n/\gamma_r$ i.i.d. random variables $\zeta(s)$, $s \in D_{r,k}$. Cramér's
Theorem therefore implies that, for each $k = 1, \ldots, \gamma_r$, $\{S_{n,r,k}, n \in \mathbb{N}\}$ satisfies the LDP
with respect to $P_n$ with scaling constants $a_n/\gamma_r$ and with the rate function

$$i(z) = \sup_{\alpha \in \mathbb{R}}\{\alpha z - c(\alpha)\} \quad \text{for } z \in \mathbb{R}.$$

This function is the lower semicontinuous, convex function conjugate to the cumulant
generating function

$$c(\alpha) = \log\int_{\mathbb{R}} e^{\alpha y}\,\rho(dy), \quad \alpha \in \mathbb{R}.$$

For $\nu_k \in \mathbb{R}$ we summarize the LDP for $S_{n,r,k}$ by the heuristic notation

$$\lim_{n \to \infty}\frac{1}{a_n/\gamma_r}\log P_n\{S_{n,r,k} \sim \nu_k\} \approx -i(\nu_k).$$

This basic LDP makes use of the fact that $c(\alpha)$ is finite for all $\alpha \in \mathbb{R}$, which follows
from the assumed property (1.1) of $\rho$. The Gärtner-Ellis Theorem allows one to extend
Cramér's Theorem to measures $\rho$ for which $c(\alpha)$ is finite for $\alpha$ in a subset $A$ of $\mathbb{R}$ that
contains 0 in its interior and for which $\lim_{n \to \infty}|c'(\alpha_n)| = \infty$ whenever $\alpha_n$ is a sequence
in $\mathrm{int}(A)$ converging to a boundary point of $\mathrm{int}(A)$ [?]. This extension is useful because
it applies to measures $\rho$ having exponential tails that arise in certain turbulence models.
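
The conjugate relation between the rate function and the cumulant generating function is easy to evaluate numerically when $\rho$ has finite support. The following Python sketch, which is ours and purely illustrative, approximates the supremum by a grid search and compares it with the known closed form for the symmetric two-point measure on $\{-1, 1\}$, for which the cumulant generating function is $\log\cosh\alpha$. The function names, the grid of $\alpha$ values, and the choice of measure are assumptions of the example.

```python
import numpy as np

def cumulant(alpha, atoms, weights):
    """c(alpha) = log sum_j w_j exp(alpha * y_j) for a finitely supported rho."""
    a = np.atleast_1d(alpha)[:, None] * atoms[None, :]
    amax = a.max(axis=1, keepdims=True)           # log-sum-exp for numerical stability
    return (amax + np.log((weights * np.exp(a - amax)).sum(axis=1, keepdims=True)))[:, 0]

def rate(z, atoms, weights, alphas=np.linspace(-40.0, 40.0, 160001)):
    """Approximate i(z) = sup_alpha {alpha * z - c(alpha)} by a grid search."""
    return float(np.max(alphas * z - cumulant(alphas, atoms, weights)))

atoms = np.array([-1.0, 1.0])          # hypothetical choice: rho = (delta_{-1} + delta_{1}) / 2
weights = np.array([0.5, 0.5])
z = 0.5
# Closed form of the Legendre-Fenchel transform of log cosh on (-1, 1).
exact = 0.5 * ((1 + z) * np.log(1 + z) + (1 - z) * np.log(1 - z))
approx = rate(z, atoms, weights)
```

Because the grid search takes a maximum over finitely many values of $\alpha$, `approx` is a lower approximation of the rate function; it should not be used for $z$ outside the convex hull of the support, where the true value is infinite.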

For each $\zeta$, $W_{n,r}(\zeta)$ is piecewise constant on the macrocells $D_{r,k}$. To give a heuristic
derivation of the LDP for $W_{n,r}$, we approximate a general $f \in L^2(T^2)$ by a piecewise
constant function of the form

$$\varphi(x) = \sum_{k=1}^{\gamma_r} \nu_k\,1_{D_{r,k}}(x).$$

Then using Cramér's Theorem for each $k = 1, \ldots, \gamma_r$ and the independence of $S_{n,r,1}, \ldots, S_{n,r,\gamma_r}$,
we have for all sufficiently large $r$

$$\begin{aligned}
\lim_{n \to \infty}\frac{1}{a_n}\log P_n\{W_{n,r} \sim f\}
&\approx \lim_{n \to \infty}\frac{1}{a_n}\log P_n\{W_{n,r} \sim \varphi\} \\
&= \frac{1}{\gamma_r}\sum_{k=1}^{\gamma_r}\lim_{n \to \infty}\frac{1}{a_n/\gamma_r}\log P_n\{S_{n,r,k} \sim \nu_k\} \\
&\approx -\frac{1}{\gamma_r}\sum_{k=1}^{\gamma_r} i(\nu_k) = -\int_{T^2} i(\varphi(x))\,dx \approx -\int_{T^2} i(f(x))\,dx.
\end{aligned} \tag{1.7}$$

This calculation makes it reasonable to expect that $W_{n,r}$ satisfies a two-parameter LDP
on $L^2(T^2)$ with the rate function

$$I(f) = \int_{T^2} i(f(x))\,dx. \tag{1.8}$$

Here and throughout the paper the term "rate function on a space $X$" denotes a lower
semicontinuous function mapping $X$ into $[0, \infty]$. A rate function need not have compact
level sets.

In Section 2 we consider a natural generalization of the doubly indexed process $W_{n,r}$
taking values in an $L^2$ space and formulate an LDP in the strong topology on that space
[Thm. 2.3]. The LDP is proved in Section 3. In Section 4 we show that the rate function
in this LDP has compact level sets with respect to the weak topology on the $L^2$ space,
though not in general with respect to the strong topology. The results in Sections 2,
3, and 4 are derived from basic principles using relatively straightforward proofs. These
techniques are related to those introduced in [?], which establishes an analogous
two-parameter LDP for a class of spatialized random measures. Those random measures also
arise in the analysis of the continuum limit for turbulence models [?], [11]. The results in
the present paper, however, are more elementary, both conceptually and technically. At
the end of Section 4 we comment on the paper [?] in the light of the present paper. We
also point out, in the next to last paragraph of that section, an oversight in a proof in [?].

In Section 5 we summarize typical applications of our main LDP stated in Theorem
2.3. In particular, we state the variational principles for the microcanonical and canonical
ensembles defined in this introduction, and we demonstrate how to obtain these principles
from a large deviation analysis of the corresponding ensembles. The solutions of the
variational principles are called equilibrium macrostates. In Section 5 we also point out
how the existence of equilibrium macrostates makes use of the weak compactness of the
level sets of the rate function. With these results in hand, we then comment on the
equivalence and nonequivalence of the microcanonical and canonical ensembles,
for which definitive results are given in [?]. A complete discussion of the physical
applications of these results is contained in [?], where families of stable, steady mean flows for
a general class of geophysical fluid dynamical models are characterized and computed.



2 Statement of the LDP



In this section we formulate the LDP for a natural generalization of the random functions
$W_{n,r}$ defined in (1.4)-(1.5). Let $(\Omega, \mathcal{F}, P)$ be a probability space and $d$ a positive integer.
For each $r \in \mathbb{N}$, let $\{S_{n,r}, n \in \mathbb{N}\}$ be a sequence of random vectors mapping $\Omega$ into $\mathbb{R}^d$
and satisfying the LDP as $n \to \infty$ with positive scaling constants $c_{n,r}$ and convex rate
function $i$ independent of $r$. In other words, for each $r$, $c_{n,r} \to \infty$ as $n \to \infty$; for any
closed subset $F$ of $\mathbb{R}^d$

$$\limsup_{n \to \infty}\frac{1}{c_{n,r}}\log P\{S_{n,r} \in F\} \le -i(F);$$

and for any open subset $G$ of $\mathbb{R}^d$

$$\liminf_{n \to \infty}\frac{1}{c_{n,r}}\log P\{S_{n,r} \in G\} \ge -i(G).$$

Here $i(B)$ denotes the infimum of $i$ over the set $B$. We assume throughout that $i$ is lower
semicontinuous and convex on $\mathbb{R}^d$. We do not assume that $i$ has compact level sets,
even though this extra property is satisfied in many applications. The setup in Section 1
corresponds to choosing $S_{n,r}$ as in (1.5) and $c_{n,r} = a_n/\gamma_r = 2^{2(n-r)}$ whenever $1 \le r < n$;
otherwise, $S_{n,r}$ and $c_{n,r}$ equal 0.

In order to give a general formulation of the LDP, we consider a Polish space $A$ with
metric $b$ (a complete separable metric space) and let $\theta$ be a probability measure on $A$.
$L^2(A, \theta)$ denotes the set of functions $f$ mapping $A$ into $\mathbb{R}^d$ and satisfying

$$\|f\|_2^2 = \int_A |f|^2\,d\theta < \infty,$$

where $|\cdot|$ denotes the Euclidean norm on $\mathbb{R}^d$. Let $\gamma_r$ be a sequence of positive integers
tending to $\infty$. For each positive integer $n$ and $r$ we introduce $S_{n,r,1}, \ldots, S_{n,r,\gamma_r}$, which are
i.i.d. copies of $S_{n,r}$ mapping $\Omega$ into $\mathbb{R}^d$. For each $r \in \mathbb{N}$ we assume that $A$ is partitioned
into $\gamma_r$ subsets $D_{r,1}, \ldots, D_{r,\gamma_r}$ and that these sets satisfy the following condition.

Condition 2.1. For each $r \in \mathbb{N}$

(i) $\theta\{D_{r,k}\} = 1/\gamma_r$, $k = 1, \ldots, \gamma_r$,

(ii) $\lim_{r \to \infty}\max_{k=1,\ldots,\gamma_r}\{\mathrm{diam}(D_{r,k})\} = 0$, where $\mathrm{diam}(D_{r,k}) = \sup_{x,y \in D_{r,k}} b(x,y)$.

In Section 1 we worked with Lebesgue measure on the unit torus $A = T^2$, which was
partitioned into $2^{2r}$ macrocells $D_{r,k}$.

The process whose asymptotics we wish to analyze is the doubly indexed sequence of
random functions defined for $\zeta \in \Omega$ and $x \in A$ by

$$W_{n,r}(x) = W_{n,r}(\zeta, x) = \sum_{k=1}^{\gamma_r} 1_{D_{r,k}}(x)\,S_{n,r,k}(\zeta). \tag{2.1}$$

Clearly, $W_{n,r}$ maps $\Omega$ into $L^2(A, \theta)$. In Theorem 2.3 we formulate the large deviation
bounds for $W_{n,r}$ with respect to the strong topology on that space. The definition of $W_{n,r}$
in (2.1) is more general than in (1.4) because here we do not assume that $S_{n,r,k}$ has the
form (1.5).

The rate function that appears in the LDP for $W_{n,r}$ is defined next.

Definition 2.2. Let $i$ denote the rate function in the LDP for $\{S_{n,r}, n \in \mathbb{N}\}$ on $\mathbb{R}^d$.
Given $f \in L^2(A, \theta)$ define

$$I(f) = \int_A i \circ f\,d\theta.$$

Since $i$ is nonnegative and convex, it follows that $I$ is well-defined, nonnegative and
convex. At the end of Section 3 we prove that $I$ is lower semicontinuous with respect to
the strong topology, which is the topology generated by the open balls
$B(f, \epsilon) = \{g \in L^2(A, \theta) : \|f - g\|_2 < \epsilon\}$ for $f \in L^2(A, \theta)$ and $\epsilon > 0$. In general, however, $I$ does not have
compact level sets with respect to the strong topology. This is easily seen by returning
to the setup of Section 1, in which $S_{n,r,k}$ is a normalized sum of i.i.d. random variables
each distributed by $\rho$. If we choose $\rho$ to be a Gaussian measure on $\mathbb{R}$ having mean 0
and variance 1, then $i(z) = \frac{1}{2}z^2$ and $I(f) = \frac{1}{2}\|f\|_2^2$. In this case, level sets of $I$ coincide
with closed balls in $L^2(A, \theta)$ centered at the origin, and these sets are not compact with
respect to the strong topology. In Section 4 we prove, in a setting midway between those
of Sections 1 and 2, that $I$ has compact level sets with respect to the weak topology on
$L^2(A, \theta)$ under the assumption that $\rho$ decays at infinity at least as fast as a Gaussian.
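
The failure of strong compactness in the Gaussian example can be seen concretely. The Python sketch below is an illustration only, taking $A = [0,1)$ with Lebesgue measure and the orthonormal functions $\sqrt{2}\sin(2\pi k x)$, both of which are our choices. These functions all lie in the level set where the rate function equals $1/2$, they stay at mutual distance $\sqrt{2}$ (so no subsequence converges strongly), and their pairings with a fixed test function shrink, as weak convergence to 0 requires.

```python
import numpy as np

N = 4096
x = np.arange(N) / N                     # uniform grid on A = [0, 1), theta = Lebesgue measure

def e(k):
    """Orthonormal function sqrt(2) * sin(2 pi k x), sampled on the grid."""
    return np.sqrt(2.0) * np.sin(2.0 * np.pi * k * x)

def norm2(f):
    """Discrete approximation of the L^2(A, theta) norm."""
    return float(np.sqrt(np.mean(f ** 2)))

def I_gauss(f):
    """I(f) = (1/2) * ||f||_2^2, the rate function when rho is standard Gaussian."""
    return 0.5 * norm2(f) ** 2

ks = [1, 2, 4, 8, 16, 32]
levels = [I_gauss(e(k)) for k in ks]
dists = [norm2(e(ks[a]) - e(ks[b]))
         for a in range(len(ks)) for b in range(a + 1, len(ks))]
g = x                                    # a fixed test function in L^2
pairings = [abs(float(np.mean(e(k) * g))) for k in ks]
```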

We now state the two-parameter LDP for $W_{n,r}$, which we prove in Section 3.

Theorem 2.3. We assume Condition 2.1 and consider $L^2(A, \theta)$ with the strong topology.
Then the function $I$ given in Definition 2.2 maps $L^2(A, \theta)$ into $[0, \infty]$ and is lower
semicontinuous, but in general $I$ does not have compact level sets. In addition, the sequence
$W_{n,r}$ satisfies the two-parameter LDP on $L^2(A, \theta)$ with rate function $I$ in the following
sense. For any strongly closed subset $F$ of $L^2(A, \theta)$

$$\limsup_{r \to \infty}\limsup_{n \to \infty}\frac{1}{\gamma_r c_{n,r}}\log P\{W_{n,r} \in F\} \le -I(F),$$

and for any strongly open subset $G$ of $L^2(A, \theta)$

$$\liminf_{r \to \infty}\liminf_{n \to \infty}\frac{1}{\gamma_r c_{n,r}}\log P\{W_{n,r} \in G\} \ge -I(G).$$

In our companion paper [?] we need the special case of Theorem 2.3 discussed in
Section 1, in which the coarse-grained process $W_{n,r}$ is defined by (1.4)-(1.5).



3 Proof of Theorem 2.3

In this section we prove the upper and lower large deviation bounds for $W_{n,r}$ in separate,
but elementary steps. The following lemma is used in both steps.






Lemma 3.1. For each $r$ the sequence $\{(S_{n,r,1}, \ldots, S_{n,r,\gamma_r}), n \in \mathbb{N}\}$ satisfies the LDP on
$(\mathbb{R}^d)^{\gamma_r}$ with scaling constants $c_{n,r}$ and the rate function

$$(\nu_1, \ldots, \nu_{\gamma_r}) \mapsto \sum_{k=1}^{\gamma_r} i(\nu_k).$$

Proof. The LDP is an immediate consequence of Lemmas 2.5-2.8 in [?], since $S_{n,r,1}, \ldots, S_{n,r,\gamma_r}$
are i.i.d. copies of $S_{n,r}$ and each sequence $\{S_{n,r}, n \in \mathbb{N}\}$ satisfies the LDP on $\mathbb{R}^d$ with
scaling constants $c_{n,r}$ and rate function $i$. ■

Proof of the Large Deviation Upper Bound 

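Let $F$ be a strongly closed subset of $L^2(A, \theta)$. For $r \in \mathbb{N}$ we define the closed set

$$F_r = \Big\{(\nu_1, \ldots, \nu_{\gamma_r}) \in (\mathbb{R}^d)^{\gamma_r} : \sum_{k=1}^{\gamma_r} \nu_k 1_{D_{r,k}} \in F\Big\}.$$

We also define $L^2_r$ to be the set of $f \in L^2(A, \theta)$ of the form $f(x) = \sum_{k=1}^{\gamma_r} \nu_k 1_{D_{r,k}}(x)$ for
some $\nu_1, \ldots, \nu_{\gamma_r} \in \mathbb{R}^d$. By Lemma 3.1, since $\theta\{D_{r,k}\} = 1/\gamma_r$,

$$\begin{aligned}
\limsup_{n \to \infty}\frac{1}{c_{n,r}}\log P\{W_{n,r} \in F\}
&= \limsup_{n \to \infty}\frac{1}{c_{n,r}}\log P\{(S_{n,r,1}, \ldots, S_{n,r,\gamma_r}) \in F_r\} \\
&\le -\gamma_r \inf\Big\{\frac{1}{\gamma_r}\sum_{k=1}^{\gamma_r} i(\nu_k) : (\nu_1, \ldots, \nu_{\gamma_r}) \in F_r\Big\} \\
&= -\gamma_r \inf\Big\{\int_A i \circ f\,d\theta : f \in F \cap L^2_r\Big\} \\
&\le -\gamma_r \inf\Big\{\int_A i \circ f\,d\theta : f \in F\Big\}.
\end{aligned}$$

Dividing by $\gamma_r$ and sending $r \to \infty$ gives the desired large deviation upper bound. ■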

Proof of the Large Deviation Lower Bound 

In order to prove this bound, we need to approximate arbitrary functions in $L^2(A, \theta)$
by functions that are piecewise-constant relative to the partition $D_{r,k}$, $k = 1, \ldots, \gamma_r$. This
is carried out in the next lemma.



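Lemma 3.2. We assume Condition 2.1. Let $f$ be any function in $L^2(A, \theta)$. For $r \in \mathbb{N}$
define

$$f^r(x) = \sum_{k=1}^{\gamma_r} f^r_k\,1_{D_{r,k}}(x), \quad \text{where } f^r_k = \gamma_r\int_{D_{r,k}} f\,d\theta.$$

Then as $r \to \infty$, $\|f - f^r\|_2 \to 0$.

Proof. For any given $\epsilon > 0$ there exists a bounded Lipschitz function $\varphi$ mapping $A$ into
$\mathbb{R}^d$ and satisfying $\|f - \varphi\|_2 < \epsilon$ [?]. Since the operator mapping $f \in L^2(A, \theta) \mapsto f^r$ is an
orthogonal projection,

$$\|f^r - \varphi^r\|_2 = \|(f - \varphi)^r\|_2 \le \|f - \varphi\|_2 < \epsilon. \tag{3.1}$$

Hence it suffices to estimate $\|\varphi - \varphi^r\|_2$, where $\varphi$ is a Lipschitz function with constant
$M < \infty$; that is, $|\varphi(x) - \varphi(y)| \le M b(y,x)$ for all $x, y \in A$. Because the disjoint sets $D_{r,k}$
have measure $\theta\{D_{r,k}\} = 1/\gamma_r$ [Cond. 2.1(i)], a straightforward calculation gives

$$\begin{aligned}
\|\varphi - \varphi^r\|_2^2
&= \sum_{k=1}^{\gamma_r}\int_A 1_{D_{r,k}}\,|\varphi - \varphi^r_k|^2\,d\theta \\
&= \sum_{k=1}^{\gamma_r}\int_{D_{r,k}}\Big|\gamma_r\int_{D_{r,k}} [\varphi(y) - \varphi(x)]\,\theta(dx)\Big|^2\,\theta(dy) \\
&\le \sum_{k=1}^{\gamma_r}\int_{D_{r,k}}\Big[\gamma_r\int_{D_{r,k}} M b(y,x)\,\theta(dx)\Big]^2\,\theta(dy) \\
&\le \Big[M \cdot \max_{k=1,\ldots,\gamma_r}\{\mathrm{diam}(D_{r,k})\}\Big]^2.
\end{aligned}$$

Sending $r \to \infty$ and using Cond. 2.1(ii), we complete the proof. ■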
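
The Lipschitz estimate in the proof of Lemma 3.2 can be checked numerically in a simple case. The sketch below is an illustration under our own assumptions: $A = [0,1)$ with Lebesgue measure and the Euclidean metric, dyadic intervals of length $2^{-r}$ as the partition (so $\gamma_r = 2^r$ and the maximal diameter is $2^{-r}$), and the Lipschitz function $\varphi(x) = |x - 0.3|$ with constant $M = 1$. Integrals are replaced by averages over a fine grid.

```python
import numpy as np

def block_average(f, r):
    """Piecewise-constant approximation f^r on the 2**r dyadic intervals of [0, 1)."""
    m = f.size // 2 ** r
    means = f.reshape(2 ** r, m).mean(axis=1)     # f^r_k = gamma_r * int_{D_{r,k}} f dtheta
    return np.repeat(means, m)

N = 2 ** 12
x = (np.arange(N) + 0.5) / N                      # midpoints of a fine grid on [0, 1)
M = 1.0
phi = np.abs(x - 0.3)                             # Lipschitz with constant M = 1
errors = {}
for r in range(1, 9):
    err = float(np.sqrt(np.mean((phi - block_average(phi, r)) ** 2)))
    errors[r] = (err, M * 2.0 ** (-r))            # (L^2 error, M * max diameter)
```

Each entry of `errors` pairs the computed $L^2$ error with the bound $M \cdot \max_k \mathrm{diam}(D_{r,k})$; in this example the error stays below the bound for every $r$ and decreases as $r$ grows.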

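Given a strongly open subset $G$ of $L^2(A, \theta)$, let $f$ be any function in $G$ and choose $\epsilon > 0$
so that $B(f, \epsilon) \subset G$. Also choose $N \in \mathbb{N}$ such that for all $r \ge N$, $B(f^r, \epsilon/2) \subset B(f, \epsilon)$,
where $f^r$ is the function defined in Lemma 3.2. Such an $N$ exists because of the $L^2(A, \theta)$-convergence
of $f^r$ to $f$ proved in that lemma. Define the open set

$$G_{r,\epsilon} = \Big\{(\nu_1, \ldots, \nu_{\gamma_r}) \in (\mathbb{R}^d)^{\gamma_r} : \sum_{k=1}^{\gamma_r} \nu_k 1_{D_{r,k}}(x) \in B(f^r, \epsilon/2)\Big\}.$$

Then for all $r \ge N$ Lemma 3.1 yields

$$\begin{aligned}
\liminf_{n \to \infty}\frac{1}{\gamma_r c_{n,r}}\log P\{W_{n,r} \in G\}
&\ge \liminf_{n \to \infty}\frac{1}{\gamma_r c_{n,r}}\log P\{W_{n,r} \in B(f^r, \epsilon/2)\} \\
&= \liminf_{n \to \infty}\frac{1}{\gamma_r c_{n,r}}\log P\{(S_{n,r,1}, \ldots, S_{n,r,\gamma_r}) \in G_{r,\epsilon}\} \\
&\ge -\frac{1}{\gamma_r}\inf\Big\{\sum_{k=1}^{\gamma_r} i(\nu_k) : (\nu_1, \ldots, \nu_{\gamma_r}) \in G_{r,\epsilon}\Big\} \\
&\ge -\frac{1}{\gamma_r}\sum_{k=1}^{\gamma_r} i(f^r_k) = -\frac{1}{\gamma_r}\sum_{k=1}^{\gamma_r} i\Big(\gamma_r\int_{D_{r,k}} f\,d\theta\Big) \\
&\ge -\sum_{k=1}^{\gamma_r}\int_{D_{r,k}} i \circ f\,d\theta = -\int_A i \circ f\,d\theta = -I(f).
\end{aligned}$$

The last inequality in this display follows from Jensen's inequality. The preceding display
states that

$$\liminf_{r \to \infty}\liminf_{n \to \infty}\frac{1}{\gamma_r c_{n,r}}\log P\{W_{n,r} \in G\} \ge -I(f)$$

for arbitrary $f \in G$. Taking the supremum of $-I(f)$ over $f \in G$ yields the desired large
deviation lower bound. ■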



Proof that I Is Strongly Lower Semicontinuous 

We end this section by proving that $I$ is lower semicontinuous with respect to the
strong topology on $L^2(A, \theta)$. Namely, we show that $\liminf_{n \to \infty} I(f_n) \ge I(f)$ for any
strongly convergent sequence $f_n \to f$ in $L^2(A, \theta)$.

There exists a subsequence $\{f_{n_k}\}$ such that $\lim_{k \to \infty} I(f_{n_k}) = \liminf_{n \to \infty} I(f_n)$ and
$f_{n_k} \to f$ $\theta$-a.s. Fatou's lemma and the lower semicontinuity of $i$ on $\mathbb{R}^d$ then yield

$$\begin{aligned}
\liminf_{n \to \infty} I(f_n) = \liminf_{n_k \to \infty} I(f_{n_k})
&= \liminf_{n_k \to \infty}\int_A i \circ f_{n_k}\,d\theta \\
&\ge \int_A \liminf_{n_k \to \infty} i \circ f_{n_k}\,d\theta \ge \int_A i \circ f\,d\theta = I(f).
\end{aligned}$$

This completes the proof of strong lower semicontinuity. ■



4 Weak versus Strong Topology 

In the first half of this section we prove, in the general setting of Section 2, that the
function $I$ in Definition 2.2 is lower semicontinuous with respect to the weak topology on
$L^2(A, \theta)$ [Thm. 4.1]. We also show, in a setting midway between those of Sections 1 and
2, that $I$ has compact level sets with respect to the weak topology [Thm. 4.2].

The fact that $I$ is weakly lower semicontinuous and has weakly compact level sets is
used to establish the existence of equilibrium macrostates for both the microcanonical and
canonical ensembles introduced in Section 1. We recall that in Section 1 $A$ equals $T^2$ and
$\theta$ equals Lebesgue measure. In the microcanonical case the equilibrium macrostates are
characterized as solutions of the following constrained minimization problem: minimize
$I(f)$ over $f \in L^2(T^2)$ subject to $H(f) \in [E - \epsilon, E + \epsilon]$. Direct methods in the calculus of
variations then assure that a minimizer exists since the functional $H$ is weakly continuous
on $L^2(T^2)$. Similarly, in the canonical case the equilibrium macrostates are characterized
as solutions of the following minimization problem: minimize $I(f) + \beta H(f)$ over
$f \in L^2(T^2)$. Again, a minimizer exists by virtue of the properties of $I$ and $H$ with respect to
the weak topology on $L^2(T^2)$. These applications are further discussed in Section 5 and
in [?].

A related issue arises in the Miller-Robert theory of coherent structures in
two-dimensional turbulence [13, 14]. In that theory the generalized enstrophy invariants

$$A(\omega) = \int_A a(\omega)\,d\theta$$

are included together with the energy invariant $H$. A family of moment functions $a$
parameterizes these extra constraints; these functions may be chosen arbitrarily provided
certain regularity and growth conditions are satisfied [?]. Unlike the Hamiltonian
$H$, the functionals $A$ are generally not continuous with respect to the weak topology on
$L^2(A, \theta)$. The classical, quadratic enstrophy $A(\omega) = \int_A \omega^2\,d\theta$ provides a counterexample.
Moreover, as the same example shows, the crucial approximation property (1.6) of $H$ is
not shared by the functionals $A$. For this reason, in the case of the Miller-Robert theory
one must rely on a coarse-graining process at the level of empirical measures rather than
at the level of sample means. Details are given in [?], [?]. After the proof of Theorem 4.2,
we comment on the connection between the main results in the present paper and those
in [?] and then point out an oversight in a proof in [?].

I Is Weakly Lower Semicontinuous 

We work in the general setting of Section 2. Let $\langle\cdot,\cdot\rangle_2$ denote the inner product on
$L^2(A, \theta)$. The weak topology on $L^2(A, \theta)$ is generated by the neighborhoods

$$B(f; f_1, \ldots, f_p, \epsilon) = \{g \in L^2(A, \theta) : |\langle f, f_i\rangle_2 - \langle g, f_i\rangle_2| < \epsilon,\ i = 1, \ldots, p\}$$

for $f \in L^2(A, \theta)$, $p \in \mathbb{N}$, $f_1, \ldots, f_p \in L^2(A, \theta)$, and $\epsilon > 0$.

Theorem 4.1. $I$ is lower semicontinuous with respect to the weak topology on $L^2(A, \theta)$.

Proof. Our strategy is to approximate $I$ by a sequence of functionals for which the lower
semicontinuity is almost immediate. For $r \in \mathbb{N}$ and $f \in L^2(A, \theta)$ define $I^r(f) = I(f^r)$,
where $f^r$ is the approximating function given in Lemma 3.2. We claim that
$I(f) = \sup_{r \in \mathbb{N}} I^r(f)$. On the one hand, Jensen's inequality ensures that

$$I^r(f) = \int_A i \circ f^r\,d\theta = \frac{1}{\gamma_r}\sum_{k=1}^{\gamma_r} i\Big(\gamma_r\int_{D_{r,k}} f\,d\theta\Big) \le I(f).$$

On the other hand, since $f^r \to f$ strongly in $L^2(A, \theta)$ [Lem. 3.2], the lower limit

$$\liminf_{r \to \infty} I^r(f) \ge I(f)$$

is valid by the strong lower semicontinuity of $I$ proved at the end of Section 3. Hence if
we prove that each $I^r$ is weakly lower semicontinuous, the weak lower semicontinuity of
$I$ follows, since a supremum of lower semicontinuous functions is lower semicontinuous.

Since the sum of weakly lower semicontinuous functions is weakly lower semicontinuous,
it suffices to prove that for each $k$ the function mapping $f \mapsto i(\gamma_r\int_{D_{r,k}} f\,d\theta)$ is weakly
lower semicontinuous. This is an immediate consequence of the facts that the linear
mapping $f \mapsto \gamma_r\int_{D_{r,k}} f\,d\theta$ is weakly continuous and that the extended real-valued function $i$
is lower semicontinuous. ■
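
The two facts used in the proof of Theorem 4.1, that Jensen's inequality bounds each block-averaged functional by the rate function and that these functionals increase to it along nested partitions, can be observed in a simple case. The sketch below is illustrative only and rests on our own choices: the Gaussian rate function $i(z) = z^2/2$, the domain $[0,1)$ with Lebesgue measure and dyadic intervals ($\gamma_r = 2^r$), and the function $f(x) = \sin(2\pi x)$, for which the rate function equals $1/4$.

```python
import numpy as np

def i_gauss(z):
    """Rate function i(z) = z**2 / 2 from Cramer's Theorem for a standard Gaussian rho."""
    return 0.5 * z ** 2

def I_r(f, r):
    """(1 / gamma_r) * sum_k i(gamma_r * int_{D_{r,k}} f dtheta), with gamma_r = 2**r."""
    means = f.reshape(2 ** r, -1).mean(axis=1)    # block averages over dyadic intervals
    return float(np.mean(i_gauss(means)))

N = 2 ** 12
x = (np.arange(N) + 0.5) / N                      # midpoints of a fine grid on [0, 1)
f = np.sin(2.0 * np.pi * x)
I_f = float(np.mean(i_gauss(f)))                  # grid version of I(f) = 1/4
values = [I_r(f, r) for r in range(0, 9)]
```

In this example `values` is nondecreasing in $r$, never exceeds `I_f`, and approaches it as the partition is refined.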

I Has Weakly Compact Level Sets 

We carry out this analysis in a setting that generalizes that of Section 1, but is more special
than that of Section 2. Let $A$ be a Polish space, $\theta$ a probability measure on $A$, and $a_n$ and
$\gamma_r$ two sequences of positive integers tending to $\infty$. Consider a subset $\mathcal{L}_n$ of $A$ consisting of
$a_n$ points. For each $r$, $A$ is assumed to be partitioned into $\gamma_r$ sets $D_{r,1}, \ldots, D_{r,\gamma_r}$ satisfying
Condition 2.1. We then define $W_{n,r}$ as in (1.4)-(1.5). We also impose the additional
condition that the measure $\rho$ that defines the common distribution of $\zeta(s)$, $s \in \mathcal{L}_n$, decays
at infinity at least as fast as a Gaussian.

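Theorem 4.2. We assume that there exists $\delta > 0$ such that

$$\int_{\mathbb{R}^d} \exp\Big(\frac{\delta}{2}|y|^2\Big)\,\rho(dy) < \infty. \tag{4.1}$$

Then the level sets of $I$ are compact with respect to the weak topology on $L^2(A, \theta)$.

Proof. First, we claim that there exists $D \in (0, \infty)$ such that for all $z \in \mathbb{R}^d$

$$i(z) \ge \frac{\delta}{2}|z|^2 - D.$$

The inequality $\langle\alpha, y\rangle \le \frac{1}{2\delta}|\alpha|^2 + \frac{\delta}{2}|y|^2$ for $\alpha$ and $y$ in $\mathbb{R}^d$ implies that

$$c(\alpha) \le \frac{|\alpha|^2}{2\delta} + \log\int_{\mathbb{R}^d} \exp\Big(\frac{\delta}{2}|y|^2\Big)\,\rho(dy) = \frac{|\alpha|^2}{2\delta} + D.$$

Hence

$$i(z) = \sup_{\alpha \in \mathbb{R}^d}\{\langle\alpha, z\rangle - c(\alpha)\} \ge \sup_{\alpha \in \mathbb{R}^d}\Big\{\langle\alpha, z\rangle - \frac{|\alpha|^2}{2\delta}\Big\} - D = \frac{\delta}{2}|z|^2 - D.$$

This establishes the claim.

Using this estimate, we see that for any $f \in L^2(A, \theta)$

$$I(f) = \int_A i \circ f\,d\theta \ge \frac{\delta}{2}\|f\|_2^2 - D, \tag{4.2}$$

from which it follows that for $M < \infty$

$$\{f \in L^2(A, \theta) : I(f) \le M\} \subset \Big\{f \in L^2(A, \theta) : \|f\|_2^2 \le \frac{2}{\delta}(M + D)\Big\}.$$

Since the closed balls $\{f \in L^2(A, \theta) : \|f\|_2^2 \le \frac{2}{\delta}(M + D)\}$ are weakly compact, the weak
compactness of the level sets of $I$ follows from the fact that the level sets are weakly
closed. This is a consequence of the weak lower semicontinuity of $I$ proved in Theorem 4.1. ■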
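
The two inequalities in the proof of Theorem 4.2 can be tested numerically for a concrete measure. The following Python sketch is an illustration only, for $d = 1$. It assumes a hypothetical three-point measure $\rho$ on $\{-2, 0, 2\}$ with equal weights, takes $\delta = 1$ and $D = \log\int \exp(\frac{\delta}{2}y^2)\,\rho(dy)$ as in our reading of the proof, and approximates the Legendre-Fenchel transform by a grid search over $\alpha$.

```python
import numpy as np

atoms = np.array([-2.0, 0.0, 2.0])        # support of a hypothetical rho
weights = np.array([1.0, 1.0, 1.0]) / 3.0
delta = 1.0
# D = log int exp((delta / 2) * y**2) rho(dy); finite because rho has bounded support.
D = float(np.log(np.sum(weights * np.exp(0.5 * delta * atoms ** 2))))

alphas = np.linspace(-30.0, 30.0, 60001)
# c(alpha) = log int exp(alpha * y) rho(dy), evaluated stably by log-sum-exp.
expo = alphas[:, None] * atoms[None, :]
shift = expo.max(axis=1)
c = shift + np.log(np.sum(weights * np.exp(expo - shift[:, None]), axis=1))

def rate(z):
    """Grid approximation (from below) of i(z) = sup_alpha {alpha * z - c(alpha)}."""
    return float(np.max(alphas * z - c))

zs = np.linspace(-2.0, 2.0, 41)           # points in the convex hull of the support
lower = 0.5 * delta * zs ** 2 - D         # claimed quadratic lower bound on i(z)
values = np.array([rate(z) for z in zs])
```

In this example `c` lies below $\alpha^2/(2\delta) + D$ on the whole grid, and `values` lies above `lower` at every test point, in agreement with the claim.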



Comments on Coarse- Grained Empirical Measures 

Here we comment briefly on the paper in the light of the present work. In that 
paper we proved the LDP for a general class of doubly indexed random measures. For the 
purpose of comparison with the present paper, we consider only a special case of those 
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random measures having a form analogous to ( |1 .4] ) ( ]T~5| ) . In a notation analogous to that 
in the present paper, define the coarse-grained empirical measures 

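\[ W_{n,r}(dx \times dy) = W_{n,r}(\zeta, dx \times dy) = \theta(dx) \otimes \sum_{k=1}^{\gamma_r} 1_{D_{r,k}}(x)\, L_{n,r,k}(\zeta, dy), \tag{4.3} \]

where

\[ L_{n,r,k}(\zeta, dy) = \frac{1}{a_n/\gamma_r} \sum_{s \in D_{r,k}} \delta_{\zeta(s)}(dy). \tag{4.4} \]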

The LDP for the measures $W_{n,r}(dx \times dy)$ follows from Sanov's Theorem [?], just as the
LDP for the process $W_{n,r}$ in the present paper follows from Cramer's Theorem. The rate
function associated with the measures $W_{n,r}(dx \times dy)$ is the relative entropy
$R(\cdot\,|\,\theta \times \rho)$. As is well known, the relative entropy is weakly lower
semicontinuous and has compact level sets with respect to the weak topology on the space
of probability measures on $\Lambda \times \mathcal{Y}$.

The doubly indexed process $W_{n,r}(\zeta, x)$ considered in the present paper equals the
density, with respect to $\theta(dx)$, of the mean with respect to $dy$ of the measure
$W_{n,r}(\zeta, dx \times dy)$. However, since the mapping taking a measure in two variables to
the density, with respect to $\theta(dx)$, of its mean with respect to $dy$ is not continuous,
it is not efficient to try to prove the LDP for the process $W_{n,r}(\zeta, x)$ from that for
the measures $W_{n,r}(\zeta, dx \times dy)$. Instead, it is much better to construct the
self-contained proof of the LDP for the process given in Section 3 of the present paper, using
the ideas introduced in [?] for the analysis of the measures.
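The relation between the two objects can be made concrete on a toy example. The sketch below assumes a one-dimensional lattice of 64 sites split into 4 contiguous cells and spins $\zeta(s) = \pm 1$; these choices and the helper name `coarse_grain` are hypothetical and serve only as an illustration. For each cell it computes the local empirical measure in the spirit of (4.4) together with the local average, and then checks that the average is the mean of the empirical measure.

```python
import random
from collections import Counter

def coarse_grain(zeta, num_cells):
    """Return per-cell averages and per-cell empirical measures of zeta.

    Assumes len(zeta) is divisible by num_cells and that the cells are
    contiguous blocks of lattice sites.
    """
    cell_size = len(zeta) // num_cells
    averages, measures = [], []
    for k in range(num_cells):
        cell = zeta[k * cell_size:(k + 1) * cell_size]
        averages.append(sum(cell) / cell_size)
        counts = Counter(cell)
        # Empirical measure: mass count / cell_size at each observed value y.
        measures.append({y: c / cell_size for y, c in counts.items()})
    return averages, measures

random.seed(0)
zeta = [random.choice([-1, 1]) for _ in range(64)]  # toy microstate
averages, measures = coarse_grain(zeta, 4)

# Mean with respect to dy of each local empirical measure.
means_of_measures = [sum(y * p for y, p in m.items()) for m in measures]
print(averages)
print(means_of_measures)
```

The two printed lists agree, which is the discrete counterpart of the statement that the process is the density of the mean of the measure. The map from measures to means is harmless here only because the toy spins are bounded.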

An abstract setting analogous to the setup in Section 2 of this paper is introduced in
Section 2 of [?]. In that generality, the purported proof in [?] that the rate function $J$
given in Definition 2.3 has compact level sets involves the following circular reasoning (see
pages 318-319). While $\mu$ in the last display on page 318 depends on $r$, the $r$ appearing in
the first display on page 319 depends on $N$, which in turn depends on $\mu$. The conclusion
is that the proof is invalid.

Although we cannot conclude in general that the quantity $J$ in [?] has compact level
sets, this need not be a serious hindrance. Indeed, in many cases one can prove directly
from the form of $J$ that it has compact level sets; a number of examples are given in
Example 2.7 in [?]. The most basic of these examples is given in part (a), where the
rate function equals the relative entropy; in this case the compactness of the level sets is
automatic. It is this particular example that is used in the applications paper [?].



5 Discussion of Applications 



In this final section we return to the setting of Section 1 and indicate how Theorem 2.3 is
applied to characterize equilibrium macrostates for the turbulence models. As in Section
1, we consider both the microcanonical ensemble and the canonical ensemble, taking into
account the energy constraint. After summarizing how Theorems 4.1 and 4.2 are used to
prove the existence of equilibrium macrostates, we point out another interesting property
of these macrostates; namely, the stability of the steady mean flows that the macrostates
determine.






The theoretical tools needed to carry out this analysis are fully developed in [7] for a
general class of statistical equilibrium models of local mean-field type. In that paper we
derive, from underlying LDP's, variational principles for the equilibrium macrostates in
both the microcanonical and canonical ensembles. Moreover, we give complete and definitive
results concerning the equivalence of these ensembles at the level of their equilibrium
macrostates, emphasizing the possibility of nonequivalence.

In an applied companion paper [?] these general results are applied to a widely used
geophysical model; namely, barotropic, quasi-geostrophic turbulence in a zonal channel.
The results in [?] rely on the LDP stated in Theorem 2.3 in the present paper for the
coarse-grained process $W_{n,r}$. For the geophysical model we find that nonequivalence of
ensembles occurs over a wide range of the physical parameters. This surprising result is then
shown to be related to stability conditions on the steady mean flows determined by the
equilibrium macrostates. The main conclusions of that paper are that when nonequivalence
prevails, equilibrium macrostates corresponding to the microcanonical ensemble are richer than
those corresponding to the canonical ensemble and furthermore that the microcanonical
equilibrium macrostates express the essential features of the coherent structures that form
in geophysical fluid flows.

We now summarize some of these results, stressing their connections to the probabilis- 
tic questions addressed in the preceding sections of this paper. 

Variational Principles for Equilibrium Macrostates 

We first consider the microcanonical ensemble $P_n^{E,\varepsilon}$, where $E$ is an admissible
energy value; i.e., one for which the constraint set $\{f \in L^2(T^2) : H(f) = E\}$ is nonempty.
For admissible $E$, we prove in Theorem 3.2 of [7] that the coarse-grained vorticity field
$W_{n,r}$ satisfies the LDP on $L^2(T^2)$ with scaling constants $a_n = 2^{2n}$ in the continuum
limit $n \to \infty$, $r \to \infty$, $\varepsilon \to 0$. The rate function $I^E$ for this LDP
is given explicitly in terms of the rate function $I(f)$ defined in (1.8); namely,

\[ I^E(f) = \begin{cases} I(f) + S(E) & \text{if } H(f) = E, \\ \infty & \text{otherwise,} \end{cases} \tag{5.1} \]

where

\[ S(E) = -\inf\{I(g) : H(g) = E\}. \tag{5.2} \]

The quantity $S(E)$ is called the microcanonical entropy. The LDP with respect to the
microcanonical ensemble follows readily from the LDP, given in Theorem 2.3, for $W_{n,r}$
with respect to the product measures $P_n$.

For $f \in L^2(T^2)$ satisfying $H(f) = E$, we summarize the LDP with respect to
$P_n^{E,\varepsilon}$ by the formal asymptotic statement

\[ P_n^{E,\varepsilon}\{W_{n,r} \sim f\} \approx \exp[-a_n I^E(f)] \quad \text{as } n \to \infty,\ r \to \infty,\ \varepsilon \to 0. \tag{5.3} \]

In this setting, $L^2(T^2)$ is the state space of the coarse-grained process, and its elements
$f$ are the macrostates or coarse-grained vorticity fields.
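The formal statement (5.3) has the same shape as the classical Cramer asymptotics for sample means, and in that simplest case the exponential decay can be checked by direct computation. The sketch below is only an analogy: it assumes i.i.d. fair spins $\pm 1$ (not the microcanonical ensemble), for which the rate function is $i(z) = \frac{1+z}{2}\log(1+z) + \frac{1-z}{2}\log(1-z)$, and it compares $-\frac{1}{n}\log P\{S_n/n \ge z\}$, computed exactly, with $i(z)$. The function names and the values of $n$ and $z$ are illustrative choices.

```python
import math

def cramer_rate(z):
    """Rate function i(z) for fair +/-1 spins (Legendre transform of log cosh)."""
    return 0.5 * (1 + z) * math.log(1 + z) + 0.5 * (1 - z) * math.log(1 - z)

def empirical_rate(n, k_min):
    """-(1/n) log P(at least k_min of n fair spins equal +1), computed exactly."""
    tail = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return -(math.log(tail) - n * math.log(2)) / n

n, k_min = 2000, 1200
z = (2 * k_min - n) / n  # the event is {S_n / n >= z} with z = 0.2
rate = cramer_rate(z)
estimate = empirical_rate(n, k_min)
print(z, rate, estimate)
```

The estimate lies slightly above $i(z)$, as the Chernoff bound requires, and the difference is of order $(\log n)/n$.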

The set of microcanonical equilibrium macrostates is defined to be

\[ \mathcal{E}^E = \{f \in L^2(T^2) : I^E(f) = 0\}. \]

This set plays a central role in the theory. For any $f \in L^2(T^2) \setminus \mathcal{E}^E$ we
have $I^E(f) > 0$. The formal statement (5.3) suggests that such a macrostate $f$ has an
exponentially small probability of being observed as a coarse-grained vorticity field in the
continuum limit of the microcanonical ensemble. As a consequence, (5.3) suggests, and the LDP
for $W_{n,r}$ with respect to the microcanonical ensemble allows one to prove, that the
macrostates $f \in \mathcal{E}^E$ are the overwhelmingly most probable coarse-grained vorticity
fields compatible with the microcanonical constraint $H(f) = E$. For this reason, the
macrostates in $\mathcal{E}^E$ determine long-lived, large-scale coherent structures in the
turbulent vorticity field, the prediction of which is the goal of the statistical equilibrium
theory.

Analogous results apply to the canonical ensemble $P_{n,\beta}$ for any $\beta \in \mathbb{R}$.
In order to obtain the LDP, the inverse temperature $\beta$ must be scaled with $a_n$; the
physical reason for this is given in [?, ?]. With respect to $P_{n,a_n\beta}$, the same
coarse-grained process $W_{n,r}$ satisfies the LDP with scaling constants $a_n$ in the continuum
limit $n \to \infty$, $r \to \infty$ [?, Thm. 2.4]. The rate function $I_\beta$ is given by

\[ I_\beta(f) = I(f) + \beta H(f) - \varphi(\beta), \tag{5.4} \]

where

\[ \varphi(\beta) = \inf_{g \in L^2(T^2)} \{I(g) + \beta H(g)\}. \]



Again, this LDP for $W_{n,r}$ follows readily from the LDP given in Theorem 2.3.
The set of canonical equilibrium macrostates is defined by

\[ \mathcal{E}_\beta = \{f \in L^2(T^2) : I_\beta(f) = 0\}. \]

The relationship between the set $\mathcal{E}^E$ and the microcanonical measures
$P_n^{E,\varepsilon}$ is mirrored by the relationship between $\mathcal{E}_\beta$ and the scaled
canonical measures $P_{n,a_n\beta}$. Namely, with respect to the latter measures,
$\mathcal{E}_\beta$ consists of the overwhelmingly most probable coarse-grained states. They
correspond to long-lived, large-scale coherent structures within a turbulent vorticity field
at a given $\beta$. It is difficult, however, to justify on physical grounds prescribing a
"turbulent temperature" $1/\beta$, especially when $\beta < 0$. This negative temperature
regime is nevertheless the one of most physical interest in real applications.



Existence of Equilibrium Macrostates 

Throughout this discussion of the existence of equilibrium macrostates, we assume
that $\rho$ satisfies the decay condition given in (4.1). We first ask whether there exist
microcanonical equilibrium macrostates for each admissible energy value $E$. In other
words, for each admissible $E$ is the set $\mathcal{E}^E$ nonempty? Because of (5.1), determining
the equilibrium macrostates in $\mathcal{E}^E$ is equivalent to solving the constrained
minimization problem

\[ \text{minimize } I(f) \text{ over } \{f \in L^2(T^2) : H(f) = E\}. \]

Analogously, we ask whether there exist canonical equilibrium macrostates for each given
inverse temperature $\beta$. In other words, is the set $\mathcal{E}_\beta$ nonempty? Because of
(5.4), determining the equilibrium macrostates in $\mathcal{E}_\beta$ is equivalent to solving
the unconstrained minimization problem

\[ \text{minimize } I(f) + \beta H(f) \text{ over } f \in L^2(T^2). \]

These variational problems are dual in the sense that the Lagrange multiplier for the
constraint $H(f) = E$ in the microcanonical problem is the prescribed parameter $\beta$ in the
canonical problem.

In both variational problems, the existence of solutions is assured by Theorems 4.1
and 4.2. With respect to the weak topology on $L^2(T^2)$, $I$ is lower semicontinuous, its level
sets are compact, and $H$ is continuous. Consequently, the direct methods of the calculus
of variations apply to the microcanonical and canonical problems, yielding the existence
of minimizers in both cases.

The following mean-field equation is satisfied by a solution $f$ of either the microcanonical
or canonical variational principle:

\[ i'(f(x)) = -\beta \int_{T^2} g(x - x') f(x')\,dx', \tag{5.5} \]

where, as in (1.2), $g(x - x')$ is the generalized Green's function. This equation shows
that the most probable, coarse-grained vorticity fields $f$ in both ensembles correspond
to steady, deterministic flows [?]. These first-order conditions are identical for the
two ensembles, except that $\beta$ is specified in the canonical ensemble while $\beta$ is
determined along with the solution $f$ in the microcanonical ensemble.
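To give a feeling for how a mean-field equation of the form (5.5) can be solved in the canonical setting, the following sketch works on a one-dimensional torus rather than $T^2$ and makes two simplifying assumptions that are not taken from the paper: the single-site measure is the symmetric measure on $\{-1, +1\}$, so that $i'(z) = \operatorname{artanh}(z)$, and the Green's function is replaced by the hypothetical zero-mean kernel $\cos(2\pi x)$. With $\beta$ prescribed, the equation becomes $f = \tanh(-\beta\, g * f)$, which the code solves by fixed-point iteration.

```python
import math

N = 32                         # grid points on the 1D torus [0, 1)
xs = [j / N for j in range(N)]
beta = -4.0                    # assumed value in the negative-temperature regime

def kernel(x):
    """Hypothetical zero-mean interaction kernel standing in for g."""
    return math.cos(2 * math.pi * x)

def convolve(f):
    """Riemann-sum approximation of the integral of g(x - x') f(x') dx'."""
    return [sum(kernel(xs[j] - xs[m]) * f[m] for m in range(N)) / N
            for j in range(N)]

def mean_field_step(f):
    """One update f <- tanh(-beta * (g * f)); tanh inverts i' for +/-1 spins."""
    return [math.tanh(-beta * c) for c in convolve(f)]

f = [0.1 * math.cos(2 * math.pi * x) for x in xs]  # small initial perturbation
for _ in range(100):
    f = mean_field_step(f)

# Residual of i'(f) = -beta * (g * f), with i'(z) = artanh(z).
residual = max(abs(math.atanh(fj) - (-beta) * cj)
               for fj, cj in zip(f, convolve(f)))
amplitude = max(abs(fj) for fj in f)
print(residual, amplitude)
```

In this toy setting the iteration settles on a nonzero profile of the form $\tanh(A\cos 2\pi x)$, a one-dimensional stand-in for a coherent structure. In the microcanonical problem $\beta$ would not be fixed in advance but adjusted so that the energy constraint holds.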



Stability of Equilibrium Macrostates 

Even though the mean-field equations satisfied by equilibria in $\mathcal{E}^E$ and in
$\mathcal{E}_\beta$ are identical, the correspondence between these sets of minimizers is subtle.
This issue is commonly called the equivalence of ensembles; it investigates the relationships
between the set of solutions of the constrained minimization problem that characterizes
microcanonical equilibrium macrostates and the set of solutions of the unconstrained
minimization problem that characterizes canonical equilibrium macrostates. This topic is
discussed in great detail in [?] for a wide range of models that includes as a special case
the model of barotropic quasi-geostrophic turbulence studied in [?].

As we show in [?], the equivalence or nonequivalence of ensembles depends entirely on
concavity properties of the microcanonical entropy $S$, which is defined in (5.2). In general,
the microcanonical equilibrium macrostates are richer than the canonical equilibrium
macrostates. This assertion is a consequence of the following results. Every
$f \in \mathcal{E}_\beta$ lies in $\mathcal{E}^E$ for some $E$. If $S$ is strictly concave and
smooth, then the usual thermodynamic relation $\beta = dS/dE$ defines a one-to-one
correspondence between the two families of equilibrium sets. If, on the other hand, $S$ is not
concave at a value $E = E^*$, then the ensembles are nonequivalent in the sense that
$\mathcal{E}^{E^*}$ is disjoint from the sets $\mathcal{E}_\beta$ for all values of $\beta$. A
precise and general formulation of this striking behavior is given in Theorem 4.4 in [?].
Moreover, in Section 2 of [?] and in Section 6 of [?], a number of examples of models of
turbulence are given in which the microcanonical entropy is not concave over a substantial
subset of its domain, and so the ensembles are nonequivalent.
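The role of concavity can be seen in a scalar caricature. Minimizing $I + \beta H$ can be carried out in two stages, first over the constraint set $\{H = E\}$ and then over $E$; by (5.2) the second stage is the minimization of $\beta E - S(E)$ over $E$. The sketch below assumes a hypothetical nonconcave entropy $S(E) = -(E^2 - 1)^2$, which is not derived from any turbulence model, and records which energies are selected by some $\beta$ on a grid.

```python
def entropy(E):
    """Hypothetical nonconcave microcanonical entropy S(E) with two humps."""
    return -(E * E - 1.0) ** 2

energies = [-2.0 + 0.01 * k for k in range(401)]  # E in [-2, 2]
betas = [-5.0 + 0.05 * k for k in range(201)]     # beta in [-5, 5]

def canonical_energy(beta):
    """Grid energy minimizing beta * E - S(E), i.e. the one selected by beta."""
    return min(energies, key=lambda E: beta * E - entropy(E))

realized = sorted({canonical_energy(b) for b in betas})

# Energies strictly between the two humps that no beta on the grid selects.
missing = [E for E in energies if abs(E) < 0.99 and E not in realized]
gap_never_selected = all(abs(E) >= 0.99 for E in realized)
print(realized[0], realized[-1], len(missing), gap_never_selected)
```

In this caricature every energy strictly between the two maxima of $S$ is skipped by the canonical problem, although each such energy is a legitimate microcanonical constraint. The skipped set is the interval on which $S$ lies below its concave hull, which contains the smaller interval on which $S$ fails to be locally concave.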

The question of equivalence between the microcanonical and canonical ensembles is
intimately related to stability conditions for the equilibrium macrostates $f$. Such stability
criteria are derived from the second-order conditions satisfied by a minimizer $f$, and
these conditions are different for the constrained and unconstrained variational problems.
With respect to the canonical ensemble, the functional $I + \beta H$ itself provides a Lyapunov
functional at $f$ whenever $f$ is a nondegenerate minimizer. This construction, which is
known as Arnold nonlinear stability analysis in the deterministic context, proves that
a perturbation of a canonical equilibrium macrostate $f$ that is small in the $L^2$ norm
remains close to $f$ in the $L^2$ norm for all time. Interestingly, the strong $L^2$ topology
for this stability theorem coincides with the topology for which the LDP holds in the
statistical equilibrium theory. In this setting, the LDP can be viewed as a weak form of
a stability statement for microstates; under an ergodic evolution of the microstates, the
coarse-grained process remains close to the equilibrium macrostate in the $L^2$ norm.

The familiar Arnold construction, however, is not adequate to prove the stability of
microcanonical equilibrium macrostates when the ensembles are not equivalent. Nevertheless,
a refined argument based on penalizing the functional $I + \beta H$ furnishes the
needed Lyapunov functional for the microcanonical ensemble. Then an $L^2$ stability result
analogous to the one mentioned for the canonical ensemble is valid. This refined stability
analysis fills an important gap in the known stability criteria for two-dimensional flows
and their geophysical counterparts. The reader is referred to [?] for a complete discussion.

Summary 

Section 5 of this paper shows the importance, in applications, of Theorem 2.3, which
proves the LDP for the doubly indexed process $W_{n,r}$ with respect to the product measures
$P_n$. The special case of this process given in (1.4)-(1.5) is needed in our companion paper
[?], which gives a rather complete analysis of the equilibrium macrostates for a basic
geophysical model. For this model Theorem 2.3 allows one to prove LDP's for the
coarse-grained process $W_{n,r}$ with respect to both the microcanonical ensemble and the
canonical ensemble. In turn, these LDP's allow one to characterize equilibrium macrostates with
respect to both ensembles, the equivalence and nonequivalence of which are determined by
concavity properties of the microcanonical entropy. In addition, in [?] stability properties
of the equilibrium macrostates are derived, using both the familiar Arnold construction
and an extension of this construction based on penalizing the Lyapunov functional used
in the Arnold construction. The fundamental innovation in this work is coarse-graining,
which, via the doubly indexed process $W_{n,r}$, allows one to mediate between a microscopic
scale on which the model is defined and a macroscopic scale on which the equilibrium
macrostates are defined and analyzed. The LDP for $W_{n,r}$ given in Theorem 2.3 is the
basic mathematical result that makes all the other analysis possible.